Abstract

Background: High-yielding cultivars of rice (Oryza sativa L.) have been developed in Japan from crosses
between overseas indica and domestic japonica cultivars. Recently, next-generation sequencing technology and 
high-throughput genotyping systems have shown many single-nucleotide polymorphisms (SNPs) that are proving 
useful for detailed analysis of genome composition. These SNPs can be used in genome-wide association studies 
to detect candidate genome regions associated with economically important traits. In this study, we used a custom 
SNP set to identify introgressed chromosomal regions in a set of high-yielding Japanese rice cultivars, and we 
performed an association study to identify genome regions associated with yield. 

Results: An informative set of 1152 SNPs was established by screening 14 high-yielding or primary ancestral
cultivars for 5760 validated SNPs. Analysis of the population structure of high-yielding cultivars showed three 
genome types: japonica-type, indica-type and a mixture of the two. SNP allele frequencies showed several regions 
derived predominantly from one of the two parental genome types. Distinct regions skewed for the presence of 
parental alleles were observed on chromosomes 1, 2, 7, 8, 11 and 12 (indica) and on chromosomes 1, 2 and 6
(japonica). A possible relationship between these introgressed regions and six yield traits (blast susceptibility, 
heading date, length of unhusked seeds, number of panicles, surface area of unhusked seeds and 1000-grain 
weight) was detected in eight genome regions dominated by alleles of one parental origin. Two of these regions 
were near Ghd7, a heading date locus, and Pi-ta, a blast resistance locus. The allele types (i.e., japonica or indica) of
significant SNPs coincided with those previously reported for candidate genes Ghd7 and Pi-ta.

Conclusions: Introgression breeding is an established strategy for the accumulation of QTLs and genes controlling 
high yield. Our custom SNP set is an effective tool for the identification of introgressed genome regions from a 
particular genetic background. This study demonstrates that changes in genome structure occurred during artificial 
selection for high yield, and provides information on several genomic regions associated with yield performance. 
Background

Rice (Oryza sativa L.) is a staple food in Asian countries.
The population of Asian countries has almost tripled 
over the past 50 years and today accounts for 60% of the 
world population [1]. To supply the necessary rice for 
food and other uses, rice breeders continue to look for 
new ways to increase yield. In the 1960s, high-yielding 
semi-dwarf cultivars such as IR8 were first released [2].
Semi-dwarf cultivars were bred to be resistant to lodging
under high nitrogen levels, which enabled their high
yield [3]. IR8 was derived from crossing the Taiwanese
semi-dwarf cultivar Dee Geo Woo Gen, which carries
the semi-dwarf 1 (sd1) gene, and the Indonesian cultivar
Peta [2,4]. The sd1 gene derived from IR8 or other
cultivars has played a crucial role in the breeding of
high-yielding rice.

Increasing grain size or weight can improve yield by 
enlarging the sink size. Previous studies [5-10] have 
identified several genes associated with sink size and 
have shown that these genes could increase the 1000- 
grain weight of rice. The ability to produce fully mature 
seed is also an important trait for high-yielding rice be- 
cause immature seeds reduce not only total grain weight, 
but also grain quality. Recently, allelic differences in the 
rice flowering-time gene DTH2 were found to influence 
seed maturation [11]. Another gene controlling flower- 
ing time, Ghd7, has been reported to regulate traits in- 
volved in yield potential, such as plant height and the 
number of spikelets per panicle [12]. However, no single 
gene among those identified as controlling sink size can 
fully improve rice yield on its own. Therefore, the associ- 
ation between sink size genes and yield traits needs to 
be clarified by using a diverse genetic population. 

Because of the utilization of rice for forage and 
bioethanol production in Japan, the materials used for 
breeding of high-yielding rice cultivars have changed 
drastically since the mid-1980s. Semi-dwarf indica culti- 
vars have been extensively used as parental lines and 
crossed with Japanese japonica cultivars to produce 
high-yielding rice cultivars for Japan [13]. 

Thus, it is likely that the introgression of genomic re- 
gions from indica cultivars has contributed to the im- 
provement of yield and other traits such as disease 
resistance in current Japanese high-yielding rice cultivars. 

To identify the genes needed to produce high-yield 
rice cultivars, it is essential to develop genetic tools for 
the molecular dissection of current high-yielding cultivars 
and other breeding materials. Recently, next-generation 
sequencing technology has identified genome-wide single- 
nucleotide polymorphisms (SNPs) among diverse rice ac- 
cessions. Large numbers of SNPs, which are often used in 
combination with high-throughput genotyping systems, 
have been widely used for genetic diversity analyses of
diverse populations [14-19] and breeding materials [20-22],
and for mutation analyses [23]. Moreover, associations between
traits and SNPs have been discovered by genome-wide as- 
sociation mapping in diverse populations [16,19,24]. These 
research platforms will allow us to elucidate genomic re- 
gions involved in yield potential and to accelerate the 
introgression of desirable genes into breeding materials. 

In the present study, we used a genome-wide associ- 
ation study (GWAS) strategy to identify genomic regions 
contributing to high yield in Japanese rice cultivars de- 
rived from indica-japonica crosses. To dissect the fine 
genomic structures of admixed cultivars, we established 
a novel SNP set from previously discovered SNPs. We 
identified regions within the high-yielding cultivars with 
high frequencies of either indica or japonica alleles and 
detected associations of these regions with traits contri- 
buting to high yield in Japanese cultivars. 

Results 

Development of a SNP set for the analysis of genomic 
constitutions of Japanese high-yielding rice cultivars 

A high-density SNP set is essential for detailed analysis 
of genome constitution. To obtain a set of informative 
SNPs evenly distributed across the rice genome, we sur- 
veyed 5760 SNPs in 14 high-yielding or primary ances- 
tral cultivars (Akenohoshi, Akihikari, Hokuriku 193, 
Hoshiyutaka, Kanto PL12, Kochihibiki, Lemont, Milyang 
23, Oochikara, Ooseto, Suweon 258, Tachisugata, Tainung 
67, and Takanari; Additional file 1: Table S1; Additional
file 2: Table S2). A subset of 2307 informative SNPs was 
selected on the basis of no missing data and the absence 
of low-frequency alleles (i.e., those found in fewer than 3 
of the 14 cultivars). By performing a simulation for 
complete linkage disequilibrium (LD), which was esti- 
mated by comparing the genotypes of pairs of neighboring 
SNP alleles, we estimated the number of SNPs required 
for practical use in the analysis of genomic constitutions 
of Japanese high-yielding rice cultivars. As the number of 
SNPs increased, the mean value of complete LD also in- 
creased, but reached a plateau at approximately 1000 
SNPs (Figure 1). Using this information and considering 
the analytical system and cost for the analysis, we ran- 
domly selected 1152 SNPs from among the 2307 inform- 
ative SNPs as a practical number for this analysis 
(Additional file 3: Figure S1).
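
The filtering rule described above (no missing data, and no allele found in fewer than 3 of the 14 cultivars) can be sketched in Python. This is a minimal illustration and not the authors' pipeline: the function name `is_informative`, the toy genotype calls, the use of `None` for a missing call and the assumption of one homozygous call per cultivar are all hypothetical. Excluding monomorphic SNPs is likewise an assumption.

```python
from collections import Counter


def is_informative(genotypes, min_allele_count=3):
    """Return True if a SNP passes the two filters described in the text.

    genotypes: one allele call per cultivar (assumed homozygous);
    None marks a missing call (hypothetical encoding).
    """
    # Filter 1: no missing data in any cultivar.
    if any(call is None for call in genotypes):
        return False
    counts = Counter(genotypes)
    # Assumption: a SNP with a single allele is not informative.
    if len(counts) < 2:
        return False
    # Filter 2: every allele must occur in at least `min_allele_count` cultivars.
    return min(counts.values()) >= min_allele_count


# Toy data for 14 cultivars (hypothetical SNP names and calls).
snps = {
    "snp_a": ["A"] * 7 + ["G"] * 7,           # balanced, complete
    "snp_b": ["A"] * 12 + ["G"] * 2,          # low-frequency allele
    "snp_c": ["A"] * 6 + [None] + ["G"] * 7,  # missing call
}
informative = [name for name, calls in snps.items() if is_informative(calls)]
```

In this toy set only `snp_a` passes both filters.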

Genome classification of Japanese high-yielding rice 
cultivars 

The genetic diversity of Japanese high-yielding rice culti- 
vars was estimated from the polymorphism information 
content (PIC) value and related values such as heterozy- 
gosity. These values were not strongly affected by sample 
size and were larger for samples of Japanese high- 
yielding rice cultivars (HY) than for samples of either do- 
mestic (PD) or overseas (PO) parents, which contained 
both japonica and indica cultivars (Table 1). They were
also larger than those of the indica and japonica samples
in both WRC and JRC populations (see Methods).

Figure 1 Mean values of complete linkage disequilibrium
(complete LD; 0 < < 1) estimated for SNP sets of different sizes.
SNP sets of different sizes were randomly chosen across the genome
(10 times per set size), and mean values of complete LD were
calculated for each set size. Vertical bars show the standard error.
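
The diversity statistics reported in Table 1 can be illustrated with a short Python sketch. The text does not state the formulas used, so this assumes the commonly used definition of PIC, 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, and computes the minor allele frequency from simple allele counts. The function names and the toy calls are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(alleles):
    """Relative frequency of each allele observed at one SNP."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return [n / total for n in counts.values()]


def minor_allele_frequency(alleles):
    """Frequency of the rarest allele; 0.0 for a monomorphic SNP."""
    freqs = allele_frequencies(alleles)
    return min(freqs) if len(freqs) > 1 else 0.0


def pic(alleles):
    """Polymorphism information content (assumed standard definition)."""
    freqs = allele_frequencies(alleles)
    homozygosity = sum(p * p for p in freqs)
    cross = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Toy SNP with two equally frequent alleles (hypothetical data).
calls = ["A"] * 7 + ["G"] * 7
balanced_pic = pic(calls)  # 0.375, the maximum for a biallelic marker
```

Averaging such per-SNP values over a marker set gives population-level figures comparable in kind to those in Table 1.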

Because Japanese high-yielding rice cultivars were de- 
rived from crosses between japonica and indica cultivars 
(Additional file 4: Figure S2), the genomes of many 
high-yielding cultivars would be expected to represent 
an admixture of indica and japonica genome types. To 
show the extent of admixture, a structure analysis was 
applied to Japanese high-yielding cultivars and other re- 
ference cultivars with taxon information. For an as- 
sumed number of populations, structure analysis can 
estimate the proportional contribution of each ancestral 
population to each cultivar. 

The structure obtained for K (number of populations) =
3 seemed to correspond to japonica, indica, and an
admixture of the two (Figure 2). At K = 4, the admixture
group was subdivided into two groups, one representing
an admixture of indica and japonica cultivars, the other
containing tropical japonica cultivars. These structures
were compared to the categories of accessions determined
by previous studies (Additional file 5: Table S3). The
structure consisting of four groups (K = 4) best fit the
clustering results.

Combination of genomic regions derived from overseas 
and domestic parental cultivars in genomes of Japanese 
high-yielding rice 

Structure and cluster analysis showed that most of the 
overseas parental cultivars were classified into the indica 
group and the domestic parents into the japonica group 
(Figure 2; Additional file 6: Figure S3). To determine the 
genome type most prevalent in each chromosomal re- 
gion, we focused on 649 of the 1152 SNPs. This subset 
was able to discriminate between the overseas indica 
(PO-indica) and domestic japonica (PD) parental mate- 
rials and allowed us to determine the distribution of 
japonica and indica alleles in the genome of Japanese 
high-yielding rice cultivars (Figure 3). The Japanese high- 
yielding cultivars showed different levels of genome-type 
mixing and could be generally classified into three types: 
japonica alleles dominant throughout the genome (type 
JA), indica alleles dominant throughout the genome (type 
IN) and an even mixture of both types (type MX). These 
three types corresponded to the japonica type, indica type 
and admixture type classified in the structure and cluster 
analyses. Although the high-yielding cultivars differed 
widely in genome structure, in certain chromosome 
regions of the mixtures, one genome type or the 
other seemed to dominate; for example, the short arm 
of chromosome 1 and most of chromosomes 11 and 12 
contained predominantly indica alleles (Figure 3). 
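
The selection of discriminating SNPs can be sketched as follows, using the criterion given in the Figure 3 caption (a difference in major allele frequency between the PD and PO-indica groups of more than 0.7). This is an illustrative sketch only: the function name `discriminates`, the choice of the PD major allele as the reference allele and the toy calls are assumptions, not the authors' implementation.

```python
from collections import Counter


def discriminates(pd_calls, po_calls, threshold=0.7):
    """True if the PD major allele differs in frequency by more than
    `threshold` between the PD and PO-indica groups."""
    # Assumption: the reference allele is the most common allele in PD.
    major = Counter(pd_calls).most_common(1)[0][0]
    freq_pd = pd_calls.count(major) / len(pd_calls)
    freq_po = po_calls.count(major) / len(po_calls)
    return abs(freq_pd - freq_po) > threshold


# Toy calls (hypothetical): one SNP separates the groups, one does not.
separating = discriminates(["A"] * 9 + ["G"], ["G"] * 9 + ["A"])
shared = discriminates(["A"] * 8 + ["G"] * 2, ["A"] * 5 + ["G"] * 5)
```

A SNP that passes this test lets each allele in a high-yielding cultivar be labeled as japonica type or indica type, which is what the graphical genotypes in Figure 3 display.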

We calculated the allele frequency of indica-type al- 
leles at each SNP in Japanese high-yielding, domestic 
and indica parental cultivars (Figure 4). Distinct regions 
dominated by the indica genome type were observed on 
the short arm of chromosome 1, the end of the long arm 
of chromosome 2, the middle of chromosome 7, the 
long arm of chromosome 8 and all of chromosomes 11 
and 12. On the other hand, some regions were dominated
by alleles from domestic parents, i.e., the long arm
of chromosome 1, most of the long arm of chromosome
2 and the middle of chromosome 6.

Table 1 Genetic features of 1046 SNPs screened from a subset of 1152 SNP markers for five rice populations

                                                            Adjusted sample size (n = 14)
Population   Sample size   MAF     Heterozygosity   PIC     MAF     Heterozygosity   PIC
All          126           0.338   0.010            0.335
HY           35            0.307   0.010            0.318   0.299   0.010            0.307
PO           14            0.264   0.010            0.279   0.264   0.010            0.279
PD           29            0.150   0.008            0.160   0.142   0.008            0.150
Indica       24            0.106   0.011            0.119   0.104   0.010            0.115
Japonica     24            0.198   0.013            0.213   0.196   0.013            0.210

MAF, minor allele frequency; PIC, polymorphism information content.
HY, high-yielding cultivar; PO, parental cultivar (overseas); PD, parental cultivar (domestic).

Figure 2 Structure analysis of 126 rice cultivars using models of two to six ancestral groups. In the upper five parts of the graph, each
vertical bar represents a single cultivar; values displayed are the estimated population fractions in each cluster. Yellow (orange and red), blue,
magenta and green indicate genome components from temperate japonica, indica, tropical japonica and high-yielding cultivars, respectively.
The bottom panel of the figure indicates the category to which each accession belongs (see Additional file 5: Table S3). Colors have the same
meaning as above. Accessions shown in white are overseas indica (PO-indica) or domestic (PD) parental cultivars.

Phenotype annotation for frequently introgressed regions 
of indica or japonica genomes 

We hypothesized that genomic regions of Japanese high- 
yielding cultivars that were dominated by a particular 
genome type (i.e., japonica or indica) might be involved 
in yield-related traits. To test this hypothesis, we per- 
formed association mapping in a population consisting 
of 68 selected lines with different yield-related traits and 
then examined the identified regions for highly skewed 
allele frequencies of either the indica or the japonica 
type. Significant associations (with permutation P < 
0.01) were detected at eight SNP loci for six yield- 
associated traits (Table 2; Additional file 7: Figure S4). 
Associations with surface area of unhusked seeds were 
detected at four loci and associations with five other 
traits (blast susceptibility, heading date, number of
panicles, 1000-grain weight and length of unhusked
seeds) were detected at one or two loci each. All of these
loci were located in genomic regions dominated by
indica-type alleles (mean allele frequencies of 0.41-0.62)
except for one locus at 21.84 Mb on chromosome 10,
which was dominated by the japonica-type allele. A
locus at 34.80 Mb on chromosome 2 showed correlation 
with surface area of unhusked seeds, length of unhusked 
seeds and 1000-grain weight. Interestingly, the values of 
all traits were smaller with indica-type alleles than with 
japonica-type alleles at this locus. A significant associ- 
ation was observed between surface area of unhusked 
seeds and loci at both 10.15 and 11.63 Mb on chromo- 
some 7 and at 19.61 Mb on chromosome 8. At each of 
the three loci, the indica allele increased the surface 
area. At 10.15 Mb on chromosome 7, the indica allele 
was significantly associated with earlier heading. At 
21.84 Mb on chromosome 10, the indica allele was asso- 
ciated with significantly reduced unhusked seed length. 
Figure 3 Graphical genotypes of Japanese high-yielding (HY), overseas indica (PO-indica) and domestic (PD) parental cultivars based
on 649 SNPs chosen to discriminate alleles derived from PD and PO-indica cultivars. Each row represents one cultivar. Rows corresponding
to HY cultivars are arranged in the same order (right to left) as in Additional file 6: Figure S3. The rows corresponding to PD cultivars are arranged
in order of cultivar number (Additional file 1: Table S1). The 649 SNPs were chosen as having a difference of major allele frequency between the
PD and PO-indica groups of >0.7.
HY, high-yielding cultivar; PO, parental cultivar (overseas); PD, parental cultivar (domestic);
JA, japonica alleles dominant throughout the genome; IN, indica alleles dominant throughout the genome;
MX, japonica and indica alleles equally prevalent.
Figure 4 SNP allele frequency of indica type in a Japanese high-yielding rice population and parental cultivars. To reduce bias caused by
parental allele frequency, HY allele frequencies were adjusted as described in Methods. Black dashed line and gray shading indicate the median 
and range (25th-75th percentile) of all mean values of HY (adjusted). 



Table 2 SNPs associated with yield-related traits

Marker^a             Chr   Position   Trait                            Mean allele   -log10(P)     Permutation   Allele effect
                           (Mb)                                        freq. of HY                 P             PO-indica        PD
NIAS_Os_aa02003989   2     34.80      Surface area of unhusked seeds   0.62          5.12          0.001         19.68 ± 1.57     23.15 ± 3.26
                                      Length of unhusked seeds                       [illegible]   [illegible]   8.26 ± 0.52      8.79 ± 0.76
                                      1000-grain weight                              5.48          0.003         22.54 ± 2.80     28.78 ± 4.94
NIAS_Os_ac07000274   7     10.15      Heading date                     0.41          5.52          0.001         115.10 ± 7.62    128.67 ± 7.14
                                      Surface area of unhusked seeds                 5.67          0.001         20.57 ± 2.27     18.05 ± 0.92
NIAS_Os_ab07000535   7     11.63      Surface area of unhusked seeds   0.47          4.83          0.003         20.53 ± 2.27     20.53 ± 2.27
NIAS_Os_ah08001148   8     19.61      Surface area of unhusked seeds   0.51          5.00          0.001         20.55 ± 2.26     19.37 ± 2.27
NIAS_Os_aa10003574   10    21.84      Length of unhusked seeds         0.21          4.52          0.004         8.26 ± 0.52      8.51 ± 0.70
P0943                11    8.38       Number of panicles               0.46          4.90          0.004         218.67 ± 30.74   287.55 ± 45.53
NIAS_Os_aa12004348   12    9.10       Blast susceptibility             0.44          5.22          0.009         0.17 ± 0.84      2.35 ± 2.13
P0926                12    21.22      Blast susceptibility             0.48          5.55          0.002         0.07 ± 0.45      2.26 ± 2.18

Table 2 (continued)

                                                      Candidate gene                       QTL
Marker^a             Trait                            Name    OsID           Pos. (Mb)     QTL ID^b   Name          LOD           Interval or co-segregating marker
NIAS_Os_aa02003989   Surface area of unhusked seeds   -       -              -             108        gw2.1         3.25          RM250-RM208
                     Length of unhusked seeds,        -       -              -             362        [illegible]   [illegible]   [illegible]
                     1000-grain weight
NIAS_Os_ac07000274   Heading date,                    Ghd7    [illegible]    9.15          -          -             -             -
                     Surface area of unhusked seeds
NIAS_Os_ab07000535   Surface area of unhusked seeds   -       -              -             -          -             -             -
NIAS_Os_ah08001148   Surface area of unhusked seeds   -       -              -             -          -             -             -
NIAS_Os_aa10003574   Length of unhusked seeds         -       -              -             211        -             7.98          RZ811-RG561
P0943                Number of panicles               -       -              -             -          -             -             -
NIAS_Os_aa12004348   Blast susceptibility             Pi-ta   Os12g0281300   10.61         -          -             -             -
P0926                Blast susceptibility             -       -              -             383        Pi32(t)       8.65          C449

^a SNPs identified as having a significant permutation P-value (P < 0.01) detected by genome-wide association mapping in a high-yielding rice population. The SNPs chosen for testing represent frequently introgressed
regions from japonica or indica parental cultivars.
^b QTL ID from Q-TARO database.
At 8.38 Mb on chromosome 11, the indica allele was
associated with reduced panicle number. At 9.10 and
21.22 Mb on chromosome 12, the indica allele was
associated with increased blast resistance.
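
To make the idea of a permutation P-value concrete, the following Python sketch tests whether a trait differs between lines carrying the indica-type and japonica-type allele at one SNP. It is a generic illustration under stated assumptions: the test statistic (absolute difference of group means), the number of permutations, the function name `permutation_p` and the toy heading-date values are all hypothetical, and the association model actually used in the study is not described in this part of the text.

```python
import random


def mean_difference(values, labels):
    """Absolute difference in trait mean between the two allele groups."""
    indica = [v for v, lab in zip(values, labels) if lab == "indica"]
    japonica = [v for v, lab in zip(values, labels) if lab == "japonica"]
    return abs(sum(indica) / len(indica) - sum(japonica) / len(japonica))


def permutation_p(values, labels, n_perm=1000, seed=0):
    """Share of label shuffles whose statistic is at least as extreme
    as the observed one (with the usual +1 correction)."""
    observed = mean_difference(values, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-trait link
        if mean_difference(values, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy heading dates in days (hypothetical values, 5 lines per allele).
days = [115, 116, 114, 117, 113, 128, 129, 127, 130, 126]
alleles = ["indica"] * 5 + ["japonica"] * 5
p_value = permutation_p(days, alleles)
```

Because shuffling keeps the number of lines per allele fixed, both groups are always non-empty. In a genome-wide scan the same shuffles would normally be applied across all SNPs so that the threshold accounts for multiple testing; that step is omitted here.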

By searching databases of functionally characterized 
genes and quantitative trait loci (QTL) in rice [25,26], 
we identified two genes and four QTLs as candidates in 
these eight regions. Ghd7, which regulates flowering 
time and yield potential [12], was near the SNP at 
10.15 Mb on chromosome 7, and the blast resistance 
gene Pi-ta [27] was near the SNP locus at 9.10 Mb on 
chromosome 12. Information in the QTL database Q- 
TARO [26] suggested that four QTLs for the traits re- 
lated to yield or blast resistance (candidate QTLs) were 
located close to three of the SNP loci (34.80 Mb on 
chromosome 2 [2 QTLs], 21.84 Mb on chromosome 10 
and 21.22 Mb on chromosome 12). 

Classification of Ghd7 and Pi-ta alleles 

To determine whether the allele types of candidate genes 
Ghd7 and Pi-ta were consistent with those of nearby 
SNPs having significant phenotype associations, we sur- 
veyed the relevant genome sequences of five Japanese 
high-yielding cultivars and one overseas parental cultivar.
Indica- or japonica-specific alleles have been reported
in previous studies of both Ghd7 [12,28] and Pi-ta [27].
We then compared the allele types of the SNPs
and candidate genes in the six cultivars. In Ghd7, the
promoter region SNP S_555 and the predicted amino
acids at four positions (122, 136, 174 and 233) in the five
Japanese high-yielding cultivars coincided with the allele
type at the nearest significant SNP locus,
NIAS_Os_ac07000274 (Table 3). Four of the five high-yielding
cultivars (all except Tachiaoba [HY30]) had an indica-type
allele for the SNP locus and at the five positions
within Ghd7, as did overseas parental cultivar Suweon
258. For Pi-ta, the predicted amino acid sequences at
three positions (148, 158 and 176) corresponded to the
same allele type as the nearest significant SNP,
NIAS_Os_aa12004348. Thus, the SNP alleles nearest Ghd7 and
Pi-ta can be used as markers to discriminate the allele
type at each of these two loci.



Discussion 

Yield is a complicated trait involving multiple compo- 
nent traits such as seedling vigor, photosynthetic rate, 
heading date and others; in turn, these traits are consid- 
ered to be controlled by multiple genes. Therefore, it has 
been difficult to identify any single factor associated with 
increased yield potential in rice. Combining diverse al- 
leles from indica and japonica is one way to produce de- 
sirable genotypes for high-yielding rice. To achieve this, 
cross-breeding has been conducted for over 30 years 
and, in fact, significant increases in yield potential have 
been achieved [13,29]. It is expected that such high- 
yielding cultivars resulted from unique combinations of 
the indica and japonica genomes, but until now, no re- 
port has presented the genome-wide genotypes of such 
high-yielding cultivars. In this study, we selected inform- 
ative SNP sets to visualize the whole-genome genotypes 
of such cultivars and identified various combinations of 
the indica and japonica genomes. Interestingly, the 
Japanese high-yielding rice cultivars could be divided 
into three groups: one dominated by japonica genome 
regions, one dominated by indica genome regions, and 
one containing admixtures. The combination of indica 
and japonica factors appears to have the greater potential
for increasing yield because the admixture-type
cultivars were most prevalent. Some chromosomal regions
had highly skewed allele frequencies, suggesting
that regions associated with high-yielding phenotypes
had been conserved during the process of breeding
selection.

Table 3 Comparison of SNP and candidate gene allele types in regions associated with significant trait differences

                                  Position         Position   Mizuhochikara   Takanari   Hokuriku 193   Tachisugata   Tachiaoba   Suweon 258
Chr   Gene/marker                 (IRGSP v.1) Mb   (aa)       HY22            HY34       HY9            HY32          HY30        PO11
7     Ghd7 promoter (S_555)^a     9.15                        T               T          T              T             C           T
7     Ghd7^b                      9.15             122        G               G          G              G             E           G
7     Ghd7^b                      9.15             136        S               S          S              S             G           S
7     Ghd7^b                      9.15             174        V               V          V              V             D           V
7     Ghd7^b                      9.15             233        A               A          A              A             P           A
7     NIAS_Os_ac07000274          10.15                       G               G          G              G             T           G
7     NIAS_Os_ab07000535          11.63                       C               C          C              C             C           C
12    NIAS_Os_aa12004348          9.10                        G               G          G              T             T           G
12    Pi-ta^d                     10.61            148        R               R          R              S             S           R
12    Pi-ta^d                     10.61            158        H               H          H              Q             Q           H
12    Pi-ta^d

^a SNP for discrimination of Ghd7 haplotype (Lu et al. 2012) [28].
^b Amino acid for discrimination of Ghd7 allele (Xue et al. 2008) [12].
^c Bold characters correspond to allele classified as indica type.
^d Amino acid for discrimination of Pi-ta allele (Bryan et al. 2000) [27].

GWAS is undoubtedly an effective way to detect 
QTLs, but is not very useful for a population consisting 
of improved cultivars, owing to the population structure. 
When using a population with a strong structure, the 
probability of detecting false-positive associations is 
higher than in a genetically divergent population. In 
spite of this risk, identification of valuable haplotypes 
underlying rice breeding populations is necessary to en- 
able acceleration of the selection process. Here, we used 
the GWAS strategy to associate genomic regions having 
highly skewed indica or japonica allele frequencies with 
yield-related phenotypes. Several previous GWASs 
[15,16,19,24] have shown that chromosomal regions as- 
sociated with several characters, from simple domestica- 
tion traits such as a glutinous phenotype to more 
complex traits such as flowering time and seed size, 
were introgressed from one cultivar group into another. 
In this study, we detected eight genome regions with sig- 
nificant phenotypic associations, two of which were near 
previously reported functionally characterized genes. 
Both the positions and the phenotypes associated with 
the latter two genome regions coincided with the posi- 
tions and phenotypes of Ghd7 and Pi-ta. 

Heading date is an important trait that affects yield 
and plant type (i.e., harvest index). We detected a signifi- 
cant signal associated with heading date at position 
10.15 Mb on chromosome 7, near Ghd7 [12], and the 
type of SNP allele in each of six cultivars coincided with 
the type of Ghd7 allele in that cultivar. The indica and 
japonica alleles found in Japanese high-yielding rice cul- 
tivars correspond to Ghd7-1 and Ghd7-2, respectively. A 
previous study [12] concluded that Ghd7-1 is a fully 
functional allele that confers late heading and that 
Ghd7-2 is a weak functional allele that confers an inter- 
mediate phenotype. However, the allele effects of Ghd7 
in our materials were the opposite of those previously 
reported: strains containing the Ghd7-1 (indica) allele
flowered earlier than those containing Ghd7-2 (Table 2,
Additional file 8: Figure S5). Other studies have also
found large variations in heading date among cultivars 
with the same Ghd7 allele [12,28]. These differences 
might be caused by the interaction of Ghd7 with other 
genes controlling heading date. 

Under the current system of rice cultivation, which 
uses paddy fields as efficiently as possible, cultivars with 
short growth duration from seeding to maturing may be 
suitable for obtaining optimal yield [30]. We also observed
a negative correlation between heading date and surface 
area of unhusked seed (Additional file 8: Figure S5, 



Additional file 9: Table S4). The strong positive relation- 
ship between surface area of unhusked seed and 1000- 
grain weight might indicate that early heading contributes 
to an increase in seed weight per grain (Additional file 9: 
Table S4). Thus, we conclude that early heading associated 
with the indica-Ghd7 allele is important for grain filling in 
Japanese high-yielding rice. 

Pi-ta is one of several rice blast resistance genes, and 
the indica-type allele has been reported to confer resist- 
ance [27]. The indica-type allele found in Japanese high- 
yielding rice cultivars was identical to the corresponding 
resistance alleles of Yashiromochi (Pi-ta) and Tetep
(Pi-ta2). It has also been suggested that progeny of Suweon
258 contain Pi-ta or Pi-ta2 [13]. We conclude that the 
frequently introgressed region from indica rice detected 
at position 9.10 Mb of chromosome 12 contains an allele 
of Pi-ta that confers resistance. 

Our analysis showed a second SNP (P0926) on 
chromosome 12 associated with blast susceptibility.
Because it was located more than 10 Mb from SNP
NIAS_Os_aa12004348 (near Pi-ta), other blast resistance
genes might be located in this region. Studies of the dis- 
tribution of disease resistance loci [31] and nucleotide- 
binding-site genes [32] in the rice genome have shown 
that many loci or genes associated with disease resist- 
ance are located in the middle part of chromosome 12. 
According to our data, parents from overseas were the
donors of blast resistance genes now found in high- 
yielding rice in Japan. 

In several other chromosomal regions that did not 
contain candidate genes, we detected significant SNPs 
co-located with four putative QTLs for yield traits. The 
reliability of the QTLs registered in Q-TARO [26] is in- 
dicated by LOD values, which were relatively high for 
these QTLs (3.25 to 8.65). Two putative QTLs [33,34] 
near the end of the long arm of chromosome 2 were as- 
sociated with grain weight, as was a SNP in this same re- 
gion, NIAS_Os_aa02003989. Although this region was 
categorized as highly skewed towards indica-type alleles,
the japonica-type allele was associated with higher grain 
weight. This direction of allele effect coincided with that 
of a previously identified QTL (QTL-ID# 362) [34]. 
Meanwhile, previous studies [33,34] reported that QTLs 
for other yield traits were also co-localized in the same 
region. Notably, a QTL associated with grain number 
per panicle was detected in two previous QTL studies 
[33,34], and the non-japonica allele identified in this re- 
gion resulted in an increase in grain number per panicle 
[34]. In this genome region, the selection of QTLs for 
traits such as grain number per panicle might have been 
stronger than for grain size and weight. 

A previously reported QTL [35] near the location of 
SNP NIAS_Os_aa10003574 on chromosome 10 was
associated with 1000-grain weight. This SNP was associated
with the length of unhusked seeds, which were longer
when the japonica-type allele was present. The direction 
of allele effect was unclear in the previous study, but the 
candidate QTL was very reliable because it was detected 
in two different populations [35]. Therefore, it is possible 
that several QTLs for seed size might be located in this 
region. 

Despite the association of yield-related phenotypes 
(surface area of unhusked seeds and number of panicles) 
with three other SNPs with highly skewed allele frequen- 
cies (Table 2), we could not find any candidate genes or 
QTLs in these regions. Characterization of currently un- 
known QTLs for yield such as these would contribute to 
the development of high-yielding rice. 

We did not detect significant QTLs responsible for ei- 
ther of the direct yield traits (air-dry seed weight and 
air-dry total plant weight) examined in this study. This
implies that these traits are controlled by many QTLs with
small effects and/or by interactions among several QTLs,
possibly with large effects. However, our
findings of the skewed allele frequencies in Japanese 
high-yielding rice cultivars and of significant QTLs con- 
trolling other yield-related phenotypes will help to eluci- 
date the complicated genetic mechanisms controlling 
rice yield. 

Conclusions 

Introgression breeding from indica to japonica or from 
japonica to indica is an established strategy for the accu- 
mulation of QTLs and genes controlling high yield while 
avoiding negative effects such as hybrid weakness, which 
is often a barrier to making wide crosses. Our inform- 
ative SNP set is an effective tool for the identification of 
introgressed indica regions in japonica genetic back- 
grounds and vice versa. Additionally, we have demon- 
strated that phenotypic annotation of introgressed 
regions is possible. Future studies leading to additional 
phenotype annotations for introgressed genomic regions 
would accelerate the identification and accumulation of 
QTLs and genes for the development of high-yielding 
rice. 

Methods 

Plant materials and DNA extraction 

A set of 35 Japanese high-yielding rice cultivars and 43
of their parental cultivars were subjected to SNP
genotyping in this study (Additional file 1: Table S1). We
used 14 core cultivars (Akenohoshi, Akihikari, Hokuriku
193, Hoshiyutaka, Kanto PL12, Kochihibiki, Lemont,
Milyang 23, Oochikara, Ooseto, Suweon 258, Tachisugata,
Tainung 67 and Takanari) to identify an appropriate
SNP set for analysis of genetic architecture in Japanese
high-yielding cultivars. To validate the SNP set, we
added 31 accessions from the NIAS world rice core
collection [36] and 17 accessions from the NIAS Japanese
rice core collection [37] (Additional file 2: Table S2). Total
DNA was extracted from young leaves of 10 plants from
each cultivar by the CTAB method [38]. A total of 5760
SNPs were derived from three resources: comparisons
between Nipponbare and Kasalath, Naba or Khau Mac Kho
(unpublished data); comparisons between Nipponbare and
Rikuu132 or Eiko [20]; and a comparison among a rice
diversity research set [14]. SNP genotyping was carried out
using the Illumina GoldenGate Bead Array technology
platform (Illumina Inc., San Diego, CA, USA). For each
sample, 250 ng of DNA was used. All experimental
procedures for the SNP genotyping followed the manufacturer's
instructions.

To develop a SNP set for the analysis of genetic archi- 
tecture of Japanese high-yielding cultivars, we estimated 
LD values as the pairwise between neighboring SNPs 
for 24 sets containing different numbers of SNPs. The 
definition of A was equivalent to that in our previous 
study [22]. The mean complete LD (0 < < 1) was esti- 
mated for different numbers of SNPs randomly chosen 
across the genome (10 times per set size). Finally, we se- 
lected a SNP set consisting of 1152 SNPs based on the 
mean value of complete LD. The 1152 SNPs are listed in 
Additional file 10: Table S5. 

Genetic diversity, structure and phylogenic analysis of
Japanese high-yielding rice cultivars

We genotyped 126 rice cultivars (Additional file 5: Table S3) 
using the set of 1152 SNPs. The following criteria for the 
classification of a non-informative SNP were adopted: (1) 
no information on its genome position (IRGSP v. 1 
[39,40]), (2) heterozygosity or no signals detected in more 
than 5% of the accessions, and (3) an allele frequency of 
<2% (to minimize the risk of genotyping error). Using 
these criteria, we selected a total of 1046 informative SNPs 
and used them for the subsequent analysis. 
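
The three exclusion criteria above can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Snp` record, the call codes ("A", "B", "H" for heterozygous, "N" for no signal) and the function name are hypothetical, and the allele frequency is computed among homozygous calls, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Snp:
    """Hypothetical SNP record; not the format used in the study."""
    name: str
    position: Optional[int]  # None when no genome position is known
    calls: Sequence[str]     # one call per accession: "A", "B", "H" or "N"


def is_informative(snp, max_bad_fraction=0.05, min_allele_freq=0.02):
    """Apply the three criteria: position, call quality, allele frequency."""
    # (1) no information on genome position
    if snp.position is None:
        return False
    # (2) heterozygous or missing calls in more than 5% of accessions
    bad = sum(call in ("H", "N") for call in snp.calls)
    if bad / len(snp.calls) > max_bad_fraction:
        return False
    # (3) allele frequency below 2% (assumed: among homozygous calls)
    a = sum(call == "A" for call in snp.calls)
    b = sum(call == "B" for call in snp.calls)
    if a + b == 0:
        return False
    return min(a, b) / (a + b) >= min_allele_freq
```

For example, `[s for s in snps if is_informative(s)]` would keep only the SNPs that pass all three checks.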

Minor allele frequency, heterozygosity and PIC for five 
populations (subsets of the 126 cultivars) were calculated 
by analyzing the data obtained for the 1046 SNPs with 
PowerMarker v. 3.25 software [41]. To estimate the ef- 
fect of sample size on these values, we recalculated each 
value by adjusting the sample size for each population to 
n = 14, corresponding to the sample size of the smallest 
population. The value for each statistic was the mean of 
10 datasets, obtained by randomly picking 14 samples 
from the original population 10 times. 
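
The resampling step can be sketched as follows. This is not the PowerMarker implementation: the function names and the "A"/"B" call coding are hypothetical, each statistic is assumed to be averaged across SNPs, and PIC is computed with the standard biallelic form PIC = 1 - (p^2 + q^2) - 2 p^2 q^2.

```python
import random
from statistics import mean


def allele_freqs(calls):
    """Frequencies of alleles 'A' and 'B' among homozygous calls."""
    a = sum(c == "A" for c in calls)
    b = sum(c == "B" for c in calls)
    total = a + b
    return (a / total, b / total) if total else (0.0, 0.0)


def maf(calls):
    """Minor allele frequency at one SNP."""
    return min(allele_freqs(calls))


def pic(calls):
    """Polymorphism information content for a biallelic SNP."""
    p, q = allele_freqs(calls)
    if p + q == 0:
        return 0.0
    return 1 - (p * p + q * q) - 2 * p * p * q * q


def subsampled_mean(population, statistic, n=14, repeats=10, seed=0):
    """Mean of a per-SNP statistic over `repeats` random subsamples of size n.

    `population` is a list of accessions, each a list of calls (one per SNP).
    """
    rng = random.Random(seed)
    values = []
    for _ in range(repeats):
        sample = rng.sample(population, n)
        n_snps = len(sample[0])
        per_snp = [statistic([acc[i] for acc in sample]) for i in range(n_snps)]
        values.append(mean(per_snp))
    return mean(values)
```

Fixing the sample size at the smallest population (n = 14) makes the statistics comparable across populations of different sizes.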

The population structure of Japanese high-yielding rice
was analyzed using InStruct software [42] with the
admixture model. To eliminate false-positive structures
arising from excess SNPs, we selected from the 1046 SNPs
10 SNPs per chromosome that were nearly evenly
distributed along each chromosome. The run-length
parameters were 5000 burn-in iterations and 100 000
replications per chain after the burn-in period using
the Markov-chain Monte Carlo method. We ran simulations
with K values (number of populations in the
model) ranging from 2 to 6, with five replications. Each
structure (Figure 2, top five panels) was compared to the
graph indicating the category of each accession determined
by previous studies (Figure 2, bottom panel; Additional
file 5: Table S3). A phylogenetic tree was constructed
using the neighbor-joining method to analyze the 126
cultivars genotyped with 1046 SNP loci; analysis was
implemented in the MEGA5 program [43]. The clusters
were characterized by using reference cultivars belonging
to two NIAS rice core collections (Additional file 5:
Table S3).
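
The text does not state how the 10 nearly evenly distributed SNPs per chromosome were chosen for the structure analysis. One possible approach, shown here purely as an assumption, is to take the SNPs closest to equally spaced target positions; the function names are hypothetical.

```python
def pick_evenly_spaced(positions, k=10):
    """Pick up to k positions closest to k evenly spaced targets."""
    if k < 2:
        raise ValueError("k must be at least 2")
    ordered = sorted(positions)
    if len(ordered) <= k:
        return ordered
    lo, hi = ordered[0], ordered[-1]
    step = (hi - lo) / (k - 1)
    remaining = list(ordered)
    chosen = []
    for i in range(k):
        target = lo + i * step
        # nearest SNP not yet selected
        best = min(remaining, key=lambda p: abs(p - target))
        chosen.append(best)
        remaining.remove(best)
    return sorted(chosen)


def thin_snps(positions_by_chrom, k=10):
    """Apply the selection to every chromosome."""
    return {chrom: pick_evenly_spaced(pos, k)
            for chrom, pos in positions_by_chrom.items()}
```

With 12 chromosomes this would yield at most 120 SNPs for the structure analysis.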

Estimation of allele frequencies differing between indica 
and japonica genome types in Japanese high-yielding rice 
cultivars 

The presence of specific SNP alleles observed in the
indica and japonica groups made it possible to
discriminate alleles in high-yielding cultivars originating from
either overseas indica (PO-indica, Additional file 1:
Table S1) or domestic japonica (PD) parental cultivars.
A genome type-specific allele was defined as one with a
difference in allele frequency of greater than 0.7 between
the two populations. To reduce bias due to parental
allele frequency, an adjusted indica-dominant allele
frequency in Japanese high-yielding cultivars (HY) was
calculated from the allele frequencies in PO-indica and
PD as adjusted HY = HY - PD - (1 - PO-indica). The
median and the 25th-75th percentile range of adjusted HY
were calculated for five-SNP windows across the genome.
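
The adjustment and the window summary can be sketched as follows. The function names are hypothetical, and two details are assumptions because the text does not specify them: the five-SNP windows are taken as non-overlapping, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`.

```python
from statistics import median, quantiles


def adjusted_hy(hy, pd, po_indica):
    """adjusted HY = HY - PD - (1 - PO-indica), per SNP."""
    return hy - pd - (1.0 - po_indica)


def window_summary(values, size=5):
    """Median and 25th/75th percentiles for consecutive windows of `size` SNPs.

    `values` holds adjusted HY per SNP in genome order; windows are assumed
    non-overlapping, and a trailing incomplete window is dropped.
    """
    summary = []
    for start in range(0, len(values) - size + 1, size):
        window = values[start:start + size]
        q1, _, q3 = quantiles(window, n=4, method="inclusive")
        summary.append((median(window), q1, q3))
    return summary
```

For example, an indica allele with frequencies HY = 0.8, PD = 0.05 and PO-indica = 0.95 gives an adjusted HY of about 0.7.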

Phenotype annotation of genomic regions with highly 
skewed indica or japonica allele frequencies 

To associate phenotype data with regions having skewed
frequencies of indica or japonica alleles, a test
population segregating for these regions is essential. Therefore,
we used 68 high-yielding breeding lines developed at the
Institute of Crop Science, National Agriculture and Food
Research Organization (NICS-NARO). These lines and
the mean values for 14 yield-related traits are shown in
Additional file 11: Table S6. Ten of the 14 traits (all
except for 3 seed-related traits) were evaluated at
NICS-NARO in 2009 with two replicates. Two traits related to
disease susceptibility (blast and bacterial blast) were
evaluated by an inoculation test with no replicates.
Seed-related traits were measured by using the SmartGrain
program [44]. Phenotypes were annotated by using the mixed
linear model implemented in the program TASSEL v.
3.0 [45] with 290 SNP loci. SNP loci that had a
permutation P < 0.01 for a given trait and fell outside the
25th-75th percentile range of adjusted HY across the
genome were used to search for candidate genes and QTLs
for that trait. To look for functionally characterized
genes that might explain an observed SNP (genome type)
allele effect, we searched the OGRO database [25] for
candidate genes that were in the same trait category and
categorized as "natural variation" within a 4-Mb region
centered on the SNP of interest. If no genes were found in
this region, we searched the Q-TARO database [26] for
QTLs harboring the relevant SNP or lying within 4 Mb of it.
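
The candidate-gene lookup within a 4-Mb region centered on a SNP can be sketched as an interval query. The `Gene` record below is hypothetical and does not reflect the actual OGRO schema; a gene is counted when any part of it overlaps the window, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    """Hypothetical gene record; not the OGRO database format."""
    name: str
    chrom: int
    start: int
    end: int
    trait_category: str
    natural_variation: bool


def candidates_near(snp_chrom, snp_pos, trait_category, genes,
                    window_bp=4_000_000):
    """Genes of the same trait category overlapping a window centered on the SNP."""
    half = window_bp // 2
    lo, hi = snp_pos - half, snp_pos + half
    return [
        g for g in genes
        if g.chrom == snp_chrom
        and g.trait_category == trait_category
        and g.natural_variation
        and g.start <= hi
        and g.end >= lo
    ]
```

An empty result would correspond to the fallback described above, in which QTL records are searched instead.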

Classification of alleles of candidate genes by 
next-generation sequencing 

Genomic DNA from six cultivars (Hokuriku 193,
Mizuhochikara, Suweon 258, Tachiaoba, Tachisugata and
Takanari) was extracted by the CTAB method [38]. An
Illumina HiSeq 2000 sequencer was used to generate
100-bp paired-end reads (three samples per lane). Reads
were mapped to the Nipponbare IRGSP v. 1 reference
genome with BWA software [46], and the alignments were
sorted and indexed with SAMtools [47]. To improve the raw
alignments in binary SAM format (BAM files) for variant
calling, we realigned indels and recalibrated base quality
scores using GATK software [48]. Duplicates were
identified using Picard (http://picard.sourceforge.net).
Variants (SNPs and indels) were called on each sample
individually with the SAMtools mpileup algorithm. The
filtering threshold was set as a quality score of ≥20.
Variants detected in candidate genes were compared among
the sequences of Nipponbare and the six cultivars.
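
The final comparison step can be illustrated by grouping cultivars that carry identical sets of variants within a candidate gene. This is only a sketch of one way to do it, not the procedure used in the study: the function name and the (position, reference, alternative) tuples are hypothetical, and the reference cultivar is represented by an empty variant list.

```python
from collections import defaultdict


def classify_alleles(variants_by_cultivar, gene_start, gene_end):
    """Group cultivars by their variant set inside one gene.

    `variants_by_cultivar` maps a cultivar name to a list of
    (position, ref, alt) tuples relative to the reference genome.
    Returns a dict mapping each distinct variant set to the sorted
    list of cultivars sharing it.
    """
    groups = defaultdict(list)
    for cultivar, variants in variants_by_cultivar.items():
        # keep only variants located within the gene
        key = frozenset(v for v in variants if gene_start <= v[0] <= gene_end)
        groups[key].append(cultivar)
    return {key: sorted(names) for key, names in groups.items()}
```

Cultivars grouped under the empty set carry the reference-type allele for that gene; every other group corresponds to a distinct allele type.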

Availability of supporting data 

The phylogenetic tree shown in Figure S3 (Additional file 6)
has been deposited in TreeBASE (http://purl.org/phylo/
treebase/phylows/study/TB2:S15706). Nucleotide sequence
data are available in the DDBJ Sequence Read
Archive under accession number DRP002297.

Additional files 



Additional file 1: Table S1. List of Japanese high-yielding and parental
rice cultivars used in this study.

Additional file 2: Table S2. List of WRC (World Rice Collection of NIAS)
and JRC (Japanese Rice Collection of NIAS) accessions used in this study.

Additional file 3: Figure S1. Chromosomal distribution of the 1152
SNPs selected for this study. Vertical bars represent chromosomes 1 to 12
(from left to right), and red horizontal bars indicate the locations of SNPs.

Additional file 4: Figure S2. Pedigree of Japanese high-yielding rice
cultivars. Pedigree extends from left to right. Blue, overseas parents;
yellow, domestic parents; green, cultivars from the world rice core collection
(NIAS); mixed-color, high-yielding rice cultivars used in this study. The labels
next to some boxes represent the cultivar numbers used in Additional file 1:
Table S1 and Additional file 2: Table S2.

Additional file 5: Table S3. Cluster assignment for each cultivar.

Additional file 6: Figure S3. Phylogenetic tree of 126 rice
accessions, constructed using the neighbor-joining method to analyze
data for 1046 SNP markers. The range of japonica, indica and tropical
japonica was estimated from reference cultivars belonging to the NIAS
Japanese and world rice core collections. Red arrows indicate the
admixture-type of Japanese high-yielding cultivars as defined from the
structure analysis. Other categories shown here are described in
Additional file 1: Table S1. Cultivars are listed in Additional file 1: Table S1
and Additional file 2: Table S2.

Additional file 7: Figure S4. Manhattan plots of GWAS (MLM) for six
significant traits in 68 selected lines. The x axis shows the relative position
on chromosomes 1 to 12, arranged with the short arm of each
chromosome to the left. The y axis shows -log(P-value) of markers.
Dashed line shows permutation P = 0.01; thus, points above the line
represent markers with significant effects. BLAST, blast susceptibility; HD,
heading date; 1000GW, 1000-grain weight; SEED AREA, surface area of
unhusked seed; SEED LENGTH, length of unhusked seed; PANICLE NO.,
number of panicles.

Additional file 8: Figure S5. Correlation between heading date and
surface area of unhusked seed. The correlation coefficient (r) was
calculated for data for 68 high-yielding rice strains (see Additional file 11:
Table S6). ***P < 0.0001.

Additional file 9: Table S4. Pearson's correlation coefficients among 12
yield-related traits.

Additional file 10: Table S5. Core set of 1152 SNPs used for analysis
of high-yield cultivars derived from indica and japonica crosses.

Additional file 11: Table S6. Yield-related traits of 68 high-yielding
breeding lines developed at NICS-NARO used for phenotypic annotation
of genome regions with highly skewed indica or japonica allele frequencies.
All trait values are means of two replicates except for those of the two
disease susceptibility traits (blast and bacterial blast), which were
unreplicated.



Competing interests

The authors declare that they have no competing interests.

Authors' contributions

JY, RM and HK contributed equally to the work, and JY supervised the study.
JY, HK, TY and MY conceived and designed the experiments. RM, HK, EY and
KE performed the experiments. JY analyzed the data. HK, HH, YT, HT, TI, HO
and HM contributed the materials. JY, RM, TY, KM and MY wrote the paper.
All authors read and approved the final manuscript.

Acknowledgements 

We thank the Japanese national and prefectural agricultural experimental
stations for providing the rice seeds. This work was supported by grants
from the Ministry of Agriculture, Forestry and Fisheries of Japan (Genomics
for Agricultural Innovation, NVR0002 and GIR1003 and Scientific Technique
Research Promotion Program for Agriculture, Forestry, Fisheries and Food
Industry) and from the Program for Promotion of Basic and Applied
Researches for Innovations in Bio-oriented Industry, Japan.

Author details 

1National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba,
Ibaraki 305-8602, Japan. 2NARO Institute of Crop Science, 2-1-18 Kannondai,
Tsukuba, Ibaraki 305-8518, Japan. 3NARO Institute of Vegetable and Tea
Science, 360 Kusawa, Ano, Tsu, Mie 514-2392, Japan. 4NARO Tohoku
Agricultural Research Center, 3 Yotsusya, Daisen, Akita 014-0102, Japan.
5NARO Hokuriku Agricultural Research Center, 1-2-1 Inada, Jyoetsu, Niigata
943-0193, Japan.

Received: 29 August 2013 Accepted: 28 April 2014 
Published: 8 May 2014 

References 

1. United Nations, Department of Economic and Social Affairs, Population
Division: World Population Prospects: The 2010 Revision, Highlights and
Advance Tables. Working Paper No. ESA/P/WP.220.

2. Hargrove TR, Cabanilla VL: Impact of semi-dwarf varieties on Asian
rice-breeding programs. Bioscience 1979, 29(12):731-735.

3. Khush GS: Green revolution: preparing for the 21st century. Genome 1999,
42(4):646-655.

4. Foster KW, Rutger JN: Inheritance of semidwarfism in rice, Oryza sativa L.
Genetics 1978, 88(3):559-574.

5. Fan C, Xing Y, Mao H, Lu J, Han B, Xu C, Li X, Zhang Q: GS3, a major QTL
for grain length and weight and minor QTL for grain width and
thickness in rice, encodes a putative transmembrane protein. Theor Appl
Genet 2006, 112(6):1164-1171.

6. Weng J, Gu S, Wan X, Gao H, Guo T, Su N, Lei C, Zhang X, Cheng Z, Guo X,
Wang J, Jiang L, Zhai H, Wan J: Isolation and initial characterization of
GW5, a major QTL associated with rice grain width and weight. Cell Res
2008, 18(12):1199-1209.

7. Shomura A, Izawa T, Ebana K, Ebitani T, Kanegae H, Konishi S, Yano M:
Deletion in a gene associated with grain size increased yields during rice
domestication. Nat Genet 2008, 40(8):1023-1028.

8. Li Y, Fan C, Xing Y, Jiang Y, Luo L, Sun L, Shao D, Xu C, Li X, Xiao J, He Y,
Zhang Q: Natural variation in GS5 plays an important role in regulating
grain size and yield in rice. Nat Genet 2011, 43(12):1266-1269.

9. Wang S, Wu K, Yuan Q, Liu X, Liu Z, Lin X, Zeng R, Zhu H, Dong G, Qian Q,
Zhang G, Fu X: Control of grain size, shape and quality by OsSPL16 in
rice. Nat Genet 2012, 44(8):950-954.

10. Ishimaru K, Hirotsu N, Madoka Y, Murakami N, Hara N, Onodera H, Kashiwagi
T, Ujiie K, Shimizu B-i, Onishi A, Miyagawa H, Katoh E: Loss of function of
the IAA-glucose hydrolase gene TGW6 enhances rice grain weight and
increases yield. Nat Genet 2013, 45:707-711.

11. Wu W, Zheng XM, Lu G, Zhong Z, Gao H, Chen L, Wu C, Wang HJ, Wang Q,
Zhou K, Wang JL, Wu F, Zhang X, Guo X, Cheng Z, Lei C, Lin Q, Jiang L,
Wang H, Ge S, Wan J: Association of functional nucleotide
polymorphisms at DTH2 with the northward expansion of rice
cultivation in Asia. Proc Natl Acad Sci 2013, 110(8):2775-2780.

12. Xue W, Xing Y, Weng X, Zhao Y, Tang W, Wang L, Zhou H, Yu S, Xu C, Li X,
Zhang Q: Natural variation in Ghd7 is an important regulator of heading
date and yield potential in rice. Nat Genet 2008, 40(6):761-767.

13. Kato H: Development of rice varieties for whole crop silage (WCS) in
Japan. Jpn Agr Res Q 2008, 42(4):231-236.

14. Ebana K, Yonemaru J-i, Fukuoka S, Iwata H, Kanamori H, Namiki N, Nagasaki
H, Yano M: Genetic structure revealed by a whole-genome single-
nucleotide polymorphism survey of diverse accessions of cultivated
Asian rice (Oryza sativa L.). Breed Sci 2010, 60(4):390-397.

15. Zhao K, Wright M, Kimball J, Eizenga G, McClung A, Kovach M, Tyagi W, Ali
ML, Tung CW, Reynolds A, Bustamante CD, McCouch SR: Genomic diversity
and introgression in O. sativa reveal the impact of domestication and
breeding on the rice genome. PLoS One 2010, 5(5):e10780.

16. Zhao K, Tung CW, Eizenga GC, Wright MH, Ali ML, Price AH, Norton GJ,
Islam MR, Reynolds A, Mezey J, McClung AM, Bustamante CD, McCouch SR:
Genome-wide association mapping reveals a rich genetic architecture of
complex traits in Oryza sativa. Nat Commun 2011, 2:467.

17. Xu X, Liu X, Ge S, Jensen JD, Hu F, Li X, Dong Y, Gutenkunst RN, Fang L,
Huang L, Li J, He W, Zhang G, Zheng X, Zhang F, Li Y, Yu C, Kristiansen K,
Zhang X, Wang J, Wright M, McCouch S, Nielsen R, Wang W: Resequencing
50 accessions of cultivated and wild rice yields markers for identifying
agronomically important genes. Nat Biotechnol 2012, 30(1):105-111.

18. Subbaiyan GK, Waters DL, Katiyar SK, Sadananda AR, Vaddadi S, Henry RJ:
Genome-wide DNA polymorphisms in elite indica rice inbreds discovered
by whole-genome sequencing. Plant Biotechnol J 2012, 10(6):623-634.

19. Huang X, Kurata N, Wei X, Wang Z-X, Wang A, Zhao Q, Zhao Y, Liu K, Lu H,
Li W, Guo Y, Lu Y, Zhou C, Fan D, Weng Q, Zhu C, Huang J, Zhang L, Wang
Y, Feng L, Furuumi H, Kubo T, Miyabayashi T, Yuan X, Xu Q, Dong G, Zhan
Q, Li C, Fujiyama A, Toyoda A, et al.: A map of rice genome variation
reveals the origin of cultivated rice. Nature 2012, 490:497-501.

20. Yamamoto T, Nagasaki H, Yonemaru JI, Ebana K, Nakajima M, Shibaya T,
Yano M: Fine definition of the pedigree haplotypes of closely related rice
cultivars by means of genome-wide discovery of single-nucleotide
polymorphisms. BMC Genomics 2010, 11(1):267.

21. Chen H, He H, Zou Y, Chen W, Yu R, Liu X, Yang Y, Gao YM, Xu JL, Fan LM,
Li Y, Li ZK, Deng XW: Development and application of a set of breeder-
friendly SNP markers for genetic analyses and molecular breeding of rice
(Oryza sativa L.). Theor Appl Genet 2011, 123(6):869-879.

22. Yonemaru J-i, Yamamoto T, Ebana K, Yamamoto E, Nagasaki H, Shibaya T,
Yano M: Genome-wide haplotype changes produced by artificial selection
during modern rice breeding in Japan. PLoS One 2012, 7(3):e32982.






23. Abe A, Kosugi S, Yoshida K, Natsume S, Takagi H, Kanzaki H, Matsumura H,
Mitsuoka C, Tamiru M, Innan H, Cano L, Kamoun S, Terauchi R: Genome
sequencing reveals agronomically important loci in rice using MutMap.
Nat Biotechnol 2012, 30(2):174-178.

24. Huang X, Zhao Y, Wei X, Li C, Wang A, Zhao Q, Li W, Guo Y, Deng L, Zhu C,
Fan D, Lu Y, Weng Q, Liu K, Zhou T, Jing Y, Si L, Dong G, Huang T, Lu T,
Feng Q, Qian Q, Li J, Han B: Genome-wide association study of flowering
time and grain yield traits in a worldwide collection of rice germplasm.
Nat Genet 2011, 44(1):32-39.

25. Yamamoto E, Yonemaru J-i, Yamamoto T, Yano M: OGRO: The Overview of
functionally characterized Genes in Rice Online database. Rice 2012,
5(1):26.

26. Yonemaru J-i, Yamamoto T, Fukuoka S, Uga Y, Hori K, Yano M: Q-TARO: QTL
Annotation Rice Online database. Rice 2010, 3:194-203.

27. Bryan GT, Wu KS, Farrall L, Jia Y, Hershey HP, McAdams SA, Faulk KN,
Donaldson GK, Tarchini R, Valent B: A single amino acid difference
distinguishes resistant and susceptible alleles of the rice blast resistance
gene Pi-ta. Plant Cell 2000, 12(11):2033-2046.

28. Lu L, Yan W, Xue W, Shao D, Xing Y: Evolution and association analysis of
Ghd7 in rice. PLoS One 2012, 7(5):e34021.

29. Sun J, Liu D, Wang JY, Ma DR, Tang L, Gao H, Xu ZJ, Chen WF: The
contribution of intersubspecific hybridization to the breeding of super-
high-yielding japonica rice in northeast China. Theor Appl Genet 2012,
125(6):1149-1157.

30. Khush G: Breaking the yield frontier of rice. GeoJournal 1995, 35(3):329-332.

31. Wisser RJ, Sun Q, Hulbert SH, Kresovich S, Nelson RJ: Identification and
characterization of regions of the rice genome associated with broad-
spectrum, quantitative disease resistance. Genetics 2005,
169(4):2277-2293.

32. Zhou T, Wang Y, Chen JQ, Araki H, Jing Z, Jiang K, Shen J, Tian D: Genome-
wide identification of NBS genes in japonica rice reveals significant
expansion of divergent non-TIR NBS-LRR genes. Mol Genet Genomics
2004, 271(4):402-415.

33. Marri PR, Sarla N, Reddy LV, Siddiq EA: Identification and mapping of yield
and yield related QTLs from an Indian accession of Oryza rufipogon. BMC
Genet 2005, 6(1):33.

34. Ishimaru K: Identification of a locus increasing rice yield and
physiological analysis of its function. Plant Physiol 2003, 133(3):1083-1090.

35. Zhuang JY, Fan YY, Wu JL, Xia YW, Zheng KL: Comparison of the detection
of QTL for yield traits in different generations of a rice cross using two
mapping approaches. Yi Chuan Xue Bao 2001, 28(5):458-464.

36. Kojima Y, Ebana K, Fukuoka S, Nagamine T, Kawase M: Development of an
RFLP-based rice diversity research set of germplasm. Breed Sci 2005,
55(4):431-440.

37. Ebana K, Kojima Y, Fukuoka S, Nagamine T, Kawase M: Development of mini
core collection of Japanese rice landrace. Breed Sci 2008, 58(3):281-291.

38. Murray MG, Thompson WF: Rapid isolation of high molecular weight
plant DNA. Nucleic Acids Res 1980, 8(19):4321-4325.

39. Kawahara Y, de la Bastide M, Hamilton J, Kanamori H, McCombie W,
Ouyang S, Schwartz D, Tanaka T, Wu J, Zhou S, Childs K, Davidson R, Lin H,
Quesada-Ocampo L, Vaillancourt B, Sakai H, Lee S, Kim J, Numa H, Itoh T,
Buell C, Matsumoto T: Improvement of the Oryza sativa Nipponbare
reference genome using next generation sequence and optical map
data. Rice 2013, 6(1):4.

40. Sakai H, Lee SS, Tanaka T, Numa H, Kim J, Kawahara Y, Wakimoto H, Yang
CC, Iwamoto M, Abe T, Yamada Y, Muto A, Inokuchi H, Ikemura T,
Matsumoto T, Sasaki T, Itoh T: Rice Annotation Project Database (RAP-DB):
An integrative and interactive database for rice genomics. Plant Cell
Physiol 2013, 54(2):e5.

41. Liu K, Muse SV: PowerMarker: an integrated analysis environment for
genetic marker analysis. Bioinformatics 2005, 21(9):2128-2129.

42. Gao H, Williamson S, Bustamante CD: A Markov chain Monte Carlo
approach for joint inference of population structure and inbreeding
rates from multilocus genotype data. Genetics 2007, 176(3):1635-1651.

43. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5:
molecular evolutionary genetics analysis using maximum likelihood,
evolutionary distance, and maximum parsimony methods. Mol Biol Evol
2011, 28(10):2731-2739.

44. Tanabata T, Shibaya T, Hori K, Ebana K, Yano M: SmartGrain: high-
throughput phenotyping software for measuring seed shape through
image analysis. Plant Physiol 2012, 160(4):1871-1880.

45. Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES:
TASSEL: software for association mapping of complex traits in diverse
samples. Bioinformatics 2007, 23(19):2633-2635.

46. Li H, Durbin R: Fast and accurate short read alignment with Burrows-
Wheeler transform. Bioinformatics 2009, 25(14):1754-1760.

47. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis
G, Durbin R: The sequence alignment/map format and SAMtools.
Bioinformatics 2009, 25(16):2078-2079.

48. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis
AA, del Angel G, Rivas MA, Hanna M, McKenna A, Fennell TJ, Kernytsky AM,
Sivachenko AY, Cibulskis K, Gabriel SB, Altshuler D, Daly MJ: A framework
for variation discovery and genotyping using next-generation DNA
sequencing data. Nat Genet 2011, 43(5):491-498.



doi:10.1186/1471-2164-15-346

Cite this article as: Yonemaru et al.: Genomic regions involved in yield
potential detected by genome-wide association analysis in Japanese
high-yielding rice cultivars. BMC Genomics 2014, 15:346.






